95 research outputs found
Task-Agnostic Graph Neural Network Evaluation via Adversarial Collaboration
There is an increasing demand for reliable methods to evaluate
the progress of Graph Neural Network (GNN) research for molecular
representation learning. Existing GNN benchmarking methods for molecular
representation learning focus on comparing the GNNs' performances on some
node/graph classification/regression tasks on certain datasets. However, a
principled, task-agnostic method to directly compare two GNNs is lacking.
Additionally, most existing self-supervised learning works incorporate
handcrafted data augmentations, which are difficult to apply to graphs
due to their unique characteristics. To address the
aforementioned issues, we propose GraphAC (Graph Adversarial Collaboration) --
a conceptually novel, principled, task-agnostic, and stable framework for
evaluating GNNs through contrastive self-supervision. We introduce a novel
objective function, the Competitive Barlow Twins, that allows two GNNs to
jointly update themselves through direct competition against each other. GraphAC
succeeds in distinguishing GNNs of different expressiveness across various
aspects, and has been demonstrated to be a principled and reliable GNN evaluation
method, without necessitating any augmentations.
Comment: 11th International Conference on Learning Representations (ICLR 2023)
Machine Learning for Drug Discovery (MLDD) Workshop. 17 pages, 6 figures, 4
tables
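As a point of reference for the objective named in the abstract above, the following is a minimal sketch of the standard (non-competitive) Barlow Twins loss that the name alludes to. It is not the Competitive Barlow Twins objective, whose exact form the abstract does not give. The function names and the default weight `lam` are illustrative assumptions, and embeddings are plain nested lists so the example needs no third-party library.

```python
import math


def standardise(z):
    """Standardise each embedding dimension to zero mean, unit variance over the batch."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        std = std or 1.0  # guard against constant dimensions
        for i in range(n):
            out[i][j] = (z[i][j] - mean) / std
    return out


def cross_correlation(za, zb):
    """Cross-correlation matrix between two batches of embeddings (batch x dim)."""
    a, b = standardise(za), standardise(zb)
    n, d = len(a), len(a[0])
    return [[sum(a[k][i] * b[k][j] for k in range(n)) / n for j in range(d)]
            for i in range(d)]


def barlow_twins_loss(za, zb, lam=5e-3):
    """Standard Barlow Twins loss: pull the diagonal of the cross-correlation
    matrix towards 1 and the off-diagonal entries towards 0.
    `lam` (assumed value) weights the off-diagonal, redundancy-reduction term."""
    c = cross_correlation(za, zb)
    d = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    off_diag = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + lam * off_diag


# Two identical batches with decorrelated dimensions give a loss of zero.
z = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
print(barlow_twins_loss(z, z))
```

In this standard form both embedding batches come from one network applied to two augmented views; how GraphAC replaces the augmentations with two competing GNNs is described in the paper itself and is not reproduced here.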
Augmentation Backdoors
Data augmentation is used extensively to improve model generalisation.
However, reliance on external libraries to implement augmentation methods
introduces a vulnerability into the machine learning pipeline. It is well known
that backdoors can be inserted into machine learning models by serving a
modified dataset to train on. Augmentation therefore presents a perfect
opportunity to perform this modification without requiring an initially
backdoored dataset. In this paper we present three backdoor attacks that can be
covertly inserted into data augmentation. Our attacks each insert a backdoor
using a different type of computer vision augmentation transform, covering
simple image transforms, GAN-based augmentation, and composition-based
augmentation. By inserting the backdoor using these augmentation transforms, we
make our backdoors difficult to detect, while still supporting arbitrary
backdoor functionality. We evaluate our attacks on a range of computer vision
benchmarks and demonstrate that an attacker is able to introduce backdoors
through just a malicious augmentation routine.
Comment: 12 pages, 8 figures
To compress or not to compress: Understanding the Interactions between Adversarial Attacks and Neural Network Compression
As deep neural networks (DNNs) become widely used, pruned and quantised models are becoming ubiquitous on edge devices; such compressed DNNs are popular for lowering computational requirements. Meanwhile, recent studies show that adversarial samples can be effective at making DNNs misclassify. We therefore investigate the extent to which adversarial samples are transferable between uncompressed and compressed DNNs. We find that adversarial samples remain transferable for both pruned and quantised models. For pruning, the adversarial samples generated from heavily pruned models remain effective on uncompressed models. For quantisation, we find the transferability of adversarial samples is highly sensitive to integer precision.
Partially supported with funds from Bosch-Forschungsstiftung im Stifterverband
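To illustrate why integer precision can matter so much, here is a minimal sketch of signed fixed-point quantisation. It assumes that "integer precision" refers to the number of integer bits in a fixed-point format, which the abstract does not spell out; the function name and bit-width convention are hypothetical and not taken from the paper.

```python
def quantise_fixed_point(x, int_bits, frac_bits):
    """Quantise x to a signed fixed-point value (assumed convention:
    int_bits includes the sign bit, frac_bits sets the resolution)."""
    scale = 2 ** frac_bits
    total_bits = int_bits + frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    q = round(x * scale)            # snap to the fixed-point grid
    q = max(q_min, min(q_max, q))   # saturate values outside the representable range
    return q / scale


# With too few integer bits, large values saturate; with enough, they survive.
print(quantise_fixed_point(2.5, int_bits=1, frac_bits=3))  # 0.875
print(quantise_fixed_point(2.5, int_bits=3, frac_bits=3))  # 2.5
```

Under this convention the integer bits set the clipping range while the fractional bits set the resolution, so reducing integer precision saturates large weights or activations rather than merely rounding them.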
Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs
Graphs are widely used to encapsulate a variety of data formats, but
real-world networks often involve complex node relations beyond only being
pairwise. While hypergraphs and hierarchical graphs have been developed and
employed to account for the complex node relations, they cannot fully represent
these complexities in practice. Additionally, though many Graph Neural Networks
(GNNs) have been proposed for representation learning on higher-order graphs,
they are usually only evaluated on simple graph datasets. Therefore, there is a
need for a unified modelling of higher-order graphs, and a collection of
comprehensive datasets with an accessible evaluation framework to fully
understand the performance of these algorithms on complex graphs. In this
paper, we introduce the concept of hybrid graphs, a unified definition for
higher-order graphs, and present the Hybrid Graph Benchmark (HGB). HGB contains
23 real-world hybrid graph datasets across various domains such as biology,
social media, and e-commerce. Furthermore, we provide an extensible evaluation
framework and a supporting codebase to facilitate the training and evaluation
of GNNs on HGB. Our empirical study of existing GNNs on HGB reveals various
research opportunities and gaps, including (1) evaluating the actual
performance improvement of hypergraph GNNs over simple graph GNNs; (2)
comparing the impact of different sampling strategies on hybrid graph learning
methods; and (3) exploring ways to integrate simple graph and hypergraph
information. We make our source code and full datasets publicly available at
https://zehui127.github.io/hybrid-graph-benchmark/.
Comment: Preprint. Under review. 16 pages, 5 figures, 11 tables
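One way to picture a structure that carries both simple-graph and hypergraph information is sketched below. This is a hypothetical container written for illustration only: it is not the paper's formal definition of a hybrid graph, nor the API of the HGB codebase. The class name and methods are assumptions. Clique expansion, used here to turn hyperedges into pairwise edges, is a common generic technique.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class HybridGraphSketch:
    """Hypothetical container holding pairwise edges and hyperedges together."""
    num_nodes: int
    edges: set = field(default_factory=set)          # pairwise edges as sorted (u, v) tuples
    hyperedges: list = field(default_factory=list)   # each hyperedge is a frozenset of nodes

    def add_edge(self, u, v):
        self.edges.add((min(u, v), max(u, v)))

    def add_hyperedge(self, nodes):
        self.hyperedges.append(frozenset(nodes))

    def clique_expansion(self):
        """Pairwise edges plus every node pair that shares a hyperedge."""
        expanded = set(self.edges)
        for hyperedge in self.hyperedges:
            expanded.update(combinations(sorted(hyperedge), 2))
        return expanded


g = HybridGraphSketch(num_nodes=4)
g.add_edge(3, 0)
g.add_hyperedge([0, 1, 2])
print(sorted(g.clique_expansion()))
```

A simple-graph GNN could consume the expanded edge set, while a hypergraph GNN would use `hyperedges` directly; comparing the two is the kind of question the abstract lists as open.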
Revisiting Automated Prompting: Are We Actually Doing Better?
Current literature demonstrates that Large Language Models (LLMs) are great
few-shot learners, and prompting significantly increases their performance on a
range of downstream tasks in a few-shot learning setting. An attempt to
automate human-led prompting followed, with some progress achieved. In
particular, subsequent work demonstrates automation can outperform fine-tuning
in certain K-shot learning scenarios.
In this paper, we revisit techniques for automated prompting on six different
downstream tasks and a larger range of K-shot learning settings. We find that
automated prompting does not consistently outperform simple manual prompts. Our
work suggests that, in addition to fine-tuning, manual prompts should be used
as a baseline in this line of research.
Wide Attention Is The Way Forward For Transformers?
The Transformer is an extremely powerful and prominent deep learning
architecture. In this work, we challenge the commonly held belief in deep
learning that going deeper is better, and present an alternative design approach:
building wider attention Transformers. We demonstrate that wide single
layer Transformer models can compete with or outperform deeper ones in a
variety of Natural Language Processing (NLP) tasks when both are trained from
scratch. The impact of changing the model aspect ratio on Transformers is then
studied systematically. This ratio balances the number of layers and the number
of attention heads per layer while keeping the total number of attention heads
and all other hyperparameters constant. On average, across 4 NLP tasks and 10
attention types, single layer wide models perform 0.3% better than their deep
counterparts. We show an in-depth evaluation and demonstrate how wide models
require a far smaller memory footprint and can run faster on commodity
hardware. In addition, these wider models are also more interpretable. For
example, a single layer Transformer on IMDb byte-level text classification
has 3.1x faster inference on a CPU than its equally accurate deeper
counterpart, and is half the size. We therefore put forward wider and shallower
models as a viable and desirable alternative for small models on NLP tasks, and
as an important area of research for domains beyond this.
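The aspect-ratio trade-off described in the abstract above, where layers are exchanged for heads per layer while the total number of attention heads stays fixed, can be sketched as a simple enumeration. The function name and the total of 12 heads are illustrative assumptions, not values from the paper.

```python
def aspect_ratio_configs(total_heads):
    """List (num_layers, heads_per_layer) pairs whose product equals total_heads."""
    return [(layers, total_heads // layers)
            for layers in range(1, total_heads + 1)
            if total_heads % layers == 0]


# From the widest (one layer, all heads) to the deepest (one head per layer).
for layers, heads in aspect_ratio_configs(12):
    print(f"{layers} layer(s) x {heads} head(s) per layer")
```

The first entry corresponds to the single-layer wide model and the last to the deepest configuration with one head per layer.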